#code-evaluation 23/04/2025
AWS Launches SWE-PolyBench: A Multilingual Benchmark to Evaluate AI Coding Agents
AWS AI Labs has launched SWE-PolyBench, an open-source, multilingual benchmark that evaluates AI coding agents on real-world coding tasks across multiple programming languages, addressing the narrow language coverage of earlier benchmarks.